Simultaneous analysis of contours and durations

Landmark Registration

Michele Gubian

July 1, 2024

f0 contours and (syllable) boundaries

Modelling misaligned contours

  • Anchor across-curves variation to meaningful boundaries (landmarks)
  • Separate variation due to:
    1. boundary misalignment
    2. shape variation anchored to boundaries
  • Boundaries may be intrinsic, e.g. peaks
  • or extrinsic, e.g. syllable boundaries on an f0 contour
  • Analyse both sources of variation jointly

General strategy

  1. Pre-process contours’ time axis
  2. Apply FPCA, then linear (mixed-effects) regression, L(ME)R, on the PC scores, then reconstruct the predicted curves

Four approaches to time axis pre-processing

  1. Original time axis (i.e. do nothing)
  2. Linear time normalisation
  3. Landmark registration
  4. Landmark registration + time warping curves

1. Original time axis

Apply FPCA

PC scores vs. Category

Fit a regression model on s1
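
The three steps (FPCA, regression on s1, reconstruction) can be sketched as follows. This is a minimal illustration, not the implementation behind the slides: it assumes curves sampled on a shared time grid and uses plain PCA on the sampled values as a discrete stand-in for FPCA. The synthetic data, the helper `fpca` and the two-level `category` are hypothetical.

```python
import numpy as np

def fpca(curves, n_components=2):
    """Discrete stand-in for FPCA: PCA on curves sampled on a shared grid.

    curves: array of shape (n_curves, n_samples).
    Returns the mean curve, the PC curves and the PC scores.
    """
    mean = curves.mean(axis=0)
    centred = curves - mean
    # Rows of vt are orthonormal principal component curves.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    pcs = vt[:n_components]
    scores = centred @ pcs.T
    return mean, pcs, scores

# Hypothetical data: 40 contours, two categories differing in peak height.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
category = np.repeat([0, 1], 20)
peak = np.exp(-((t - 0.5) ** 2) / 0.02)
curves = (1.0 + category[:, None]) * peak + 0.05 * rng.standard_normal((40, 50))

mean, pcs, scores = fpca(curves)

# Linear regression of s1 on Category (ordinary least squares).
X = np.column_stack([np.ones(len(category)), category])
beta, *_ = np.linalg.lstsq(X, scores[:, 0], rcond=None)

# Reconstruction: predicted s1 per category, mapped back to a predicted curve.
s1_pred = np.array([beta[0], beta[0] + beta[1]])
predicted_curves = mean + s1_pred[:, None] * pcs[0]
```

Each row of `predicted_curves` is the curve the model predicts for one category. A mixed-effects model would replace the least-squares step; that variant is not shown here.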

Comments

  • Predicted curves do not show duration variation
  • Landmarks are missing

2. Linear time normalisation

[Figures: curves before and after linear time normalisation]

Apply FPCA to linearly time-normalised curves
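
The linear time normalisation applied before this FPCA step can be sketched as below. It is a minimal example under stated assumptions: each contour is resampled by linear interpolation on a 0 to 1 axis, and the function name `linear_time_normalise`, the grid size and the two sine-shaped contours are hypothetical.

```python
import numpy as np

def linear_time_normalise(t, y, n_samples=101):
    """Resample one contour on a normalised time axis running from 0 to 1.

    t: increasing sample times of the contour; y: contour values at t.
    """
    u = (t - t[0]) / (t[-1] - t[0])   # linear map of [t[0], t[-1]] onto [0, 1]
    grid = np.linspace(0.0, 1.0, n_samples)
    return grid, np.interp(grid, u, y)

# Two hypothetical contours of different duration (0.30 s and 0.45 s).
t_a = np.linspace(0.0, 0.30, 31)
t_b = np.linspace(0.0, 0.45, 46)
y_a = np.sin(np.pi * t_a / 0.30)
y_b = np.sin(np.pi * t_b / 0.45)

grid, z_a = linear_time_normalise(t_a, y_a)
_, z_b = linear_time_normalise(t_b, y_b)
```

After normalisation the two contours nearly coincide: the difference in duration is no longer represented in the data passed to FPCA.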

PC scores vs. Category

Fit a regression model on s1

Comments

  • Curve shape distorted
  • Predicted curves do not show duration variation
  • Landmarks are missing

3. Landmark registration

[Figures: curves before and after landmark registration]

Landmark registration

  • Aligns all curves to the given landmarks
  • Curves are smoothly distorted
  • Depends on landmarks given by the user
  • Does not align curves based on their shapes (as DTW does)

Procedure overview

Apply FPCA to landmark-registered curves
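
The registration step that precedes this FPCA can be illustrated with a deliberately simplified sketch. It assumes a piecewise-linear warping in place of the smooth warping described above, so it only shows the principle. Here \(h\) maps registered time to original time; the function name `landmark_register`, the landmark positions and the triangular contour are hypothetical.

```python
import numpy as np

def landmark_register(t, y, landmarks, ref_landmarks, grid):
    """Register one contour with a piecewise-linear time warping h.

    t, y: original sample times and values.
    landmarks: landmark times on the original axis (start and end included).
    ref_landmarks: common target positions of those landmarks on the registered axis.
    grid: registered time axis on which to sample.
    Returns h(grid) and the registered contour y(h(grid)).
    """
    # h maps registered time to original time and passes through every landmark.
    h = np.interp(grid, ref_landmarks, landmarks)
    return h, np.interp(h, t, y)

# Hypothetical contour with a syllable boundary at 0.10 s out of 0.40 s.
t = np.linspace(0.0, 0.40, 81)
y = np.where(t < 0.10, t / 0.10, 1.0 - (t - 0.10) / 0.30)  # peak at the boundary
landmarks = np.array([0.0, 0.10, 0.40])
ref_landmarks = np.array([0.0, 0.5, 1.0])   # boundary placed at the mid-point
grid = np.linspace(0.0, 1.0, 101)

h, y_reg = landmark_register(t, y, landmarks, ref_landmarks, grid)
```

In the registered contour the peak sits at the common boundary position 0.5, whatever its original time.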

PC scores vs. Category

Fit a regression model on s1

Comments

  • Curve shape distorted
  • Predicted curves do not show duration variation
  • Landmarks are preserved
    • we know where they are
    • we interpret curve shapes against landmark position

4. Landmark registration + time warping curves

Inside landmark registration

[Figures: original curves + landmarks; time warping curves \(h(t)\); registered curves]

Equivalent representations

  1. Original curves + landmarks
  2. Registered curves + time warping curves \(h(t)\)

Time warping curves as log rates

\(r(t) = -\log \frac{dh(t)}{dt}\)

Procedure

  • Compute landmark registration
  • Pull out time warping curves \(h(t)\)
  • Transform them into log rates \(r(t) = -\log \frac{dh(t)}{dt}\)
  • Compose 2-dimensional curves from
    1. Landmark registered curves
    2. Respective log rates
  • Apply multidimensional FPCA
  • Apply linear regression on a PC score
  • Reverse log rate predicted curves to inter-landmark durations
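
The log rate transform and its reversal can be sketched numerically. The example assumes that \(h\) maps registered time to original time, so that \(dh/dt = e^{-r(t)}\) and the duration between two landmarks is the integral of \(e^{-r(t)}\) over the corresponding registered interval. The warping function, the landmark positions and the helper `durations_from_log_rate` are hypothetical.

```python
import numpy as np

# Hypothetical registered axis with landmarks at 0, 0.5 and 1.
grid = np.linspace(0.0, 1.0, 1001)
ref_landmarks = np.array([0.0, 0.5, 1.0])

# Hypothetical smooth, strictly increasing warping h(t), in seconds.
h = 0.40 * (grid + 0.3 * np.sin(np.pi * grid) / np.pi)

# Log rate r(t) = -log(dh/dt); the derivative is approximated numerically.
dh_dt = np.gradient(h, grid)
log_rate = -np.log(dh_dt)

def durations_from_log_rate(log_rate, grid, ref_landmarks):
    """Integrate exp(-r(t)) between consecutive landmarks (trapezoidal rule)."""
    speed = np.exp(-log_rate)                       # equals dh/dt
    steps = 0.5 * (speed[1:] + speed[:-1]) * np.diff(grid)
    cumulative = np.concatenate([[0.0], np.cumsum(steps)])
    at_landmarks = np.interp(ref_landmarks, grid, cumulative)
    return np.diff(at_landmarks)

durations = durations_from_log_rate(log_rate, grid, ref_landmarks)
```

The recovered durations match the differences of \(h\) at the landmarks up to numerical error, which is why a predicted log rate curve can be read as predicted inter-landmark durations.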

Procedure overview (1)

Procedure overview (2)

Apply 2-dim FPCA to curves + log rates

[Figures: curve dimension; log rate dimension, also shown as duration dimension]

PC scores vs. Category

Fit a regression model on s1
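
A minimal sketch of the joint model, assuming both dimensions are sampled on the same registered grid and that 2-dim FPCA is approximated by PCA on the concatenated samples. This is a simplification for illustration, not the procedure of the cited papers; the synthetic data and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 50)
category = np.repeat([0, 1], 20)

# Hypothetical registered contours: category 1 has a higher peak ...
shape = (1.0 + 0.5 * category[:, None]) * np.sin(np.pi * grid)
shape = shape + 0.02 * rng.standard_normal(shape.shape)
# ... and a lower log rate (slower speech) before the landmark at 0.5.
log_rate = -0.4 * category[:, None] * (grid < 0.5)
log_rate = log_rate + 0.02 * rng.standard_normal((40, 50))

# Stack both dimensions so that each PC has a curve part and a log rate part.
# Rescaling the two dimensions to comparable variance may be needed; omitted here.
joint = np.hstack([shape, log_rate])
mean = joint.mean(axis=0)
_, _, vt = np.linalg.svd(joint - mean, full_matrices=False)
pc1 = vt[0]
s1 = (joint - mean) @ pc1

# Regression of s1 on Category, then joint reconstruction of both dimensions.
X = np.column_stack([np.ones(40), category])
beta, *_ = np.linalg.lstsq(X, s1, rcond=None)
s1_pred = np.array([beta[0], beta[0] + beta[1]])
pred = mean + s1_pred[:, None] * pc1
pred_shape, pred_log_rate = pred[:, :50], pred[:, 50:]
```

A single score s1 moves the curve shape and the log rate together, so one regression model yields a predicted contour and a predicted log rate curve per category.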

Comments

  • Landmarks preserved
  • FPCA captures co-variation of shape and inter-landmark durations
  • Regression model predicts both curve shape and inter-landmark durations
  • No information loss
  • Maths details in Gubian, Boves, and Cangemi (2011) and appendix A in Asano and Gubian (2018)

References

Asano, Yuki, and Michele Gubian. 2018. “‘Excuse Meeee!!’: (Mis)coordination of Lexical and Paralinguistic Prosody in L2 Hyperarticulation.” Speech Communication 99: 183–200.
Asano, Yuki, Michele Gubian, and Sacha Dominik. 2016. “Cutting down on Manual Pitch Contour Annotation Using Data Modelling.”
Cronenberg, Johanna, Michele Gubian, Jonathan Harrington, and Hanna Ruch. 2020. “A Dynamic Model of the Change from Pre-to Post-Aspiration in Andalusian Spanish.” Journal of Phonetics 83: 101016.
El Zarka, Dina, Anneliese Kelterer, Michele Gubian, and Barbara Schuppler. 2024. “The Prosody of Theme, Rheme and Focus in Egyptian Arabic: A Quantitative Investigation of Tunes, Configurations and Speaker Variability.” Speech Communication, 103082.
Gubian, Michele, Lou Boves, and Francesco Cangemi. 2011. “Joint Analysis of f0 and Speech Rate with Functional Data Analysis.” In 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4972–75. IEEE.
Gubian, Michele, Francesco Cangemi, and Lou Boves. 2010. “Automatic and Data Driven Pitch Contour Manipulation with Functional Data Analysis.”
Gubian, Michele, Jonathan Harrington, Mary Stevens, Florian Schiel, and Paul Warren. 2019. “Tracking the New Zealand English near/square Merger Using Functional Principal Components Analysis.” In Proceedings of the 20th Annual Conference of the International Speech Communication Association, 296–300. Graz. https://doi.org/10.21437/Interspeech.2019-2115.
Gubian, Michele, Manfred Pastätter, and Marianne Pouplier. 2019. “Zooming in on Spatiotemporal v-to-c Coarticulation with Functional PCA.” In INTERSPEECH, 889–93.
Gubian, Michele, Francisco Torreira, and Lou Boves. 2015. “Using Functional Data Analysis for Investigating Multidimensional Dynamic Phonetic Contrasts.” Journal of Phonetics 49: 16–40. https://doi.org/10.1016/j.wocn.2014.10.001.
Ramsay, J., and B. Silverman. 2005. Functional Data Analysis. Springer Series in Statistics.